Cohesin is implicated in establishing tissue-specific DNA loops that target enhancers to promoters, and also localizes to
sites bound by the insulator protein CTCF, which blocks enhancer-promoter communication. However, cohesin-associated
interactions have not been characterized on a genome-wide scale. Here we performed chromatin interaction analysis
with paired-end tag sequencing (ChIA-PET) of the cohesin subunit SMC1A in developing mouse limb. We identified 2264
SMC1A interactions, of which 1491 (65%) involved sites co-occupied by CTCF. SMC1A participates in tissue-specific
enhancer-promoter interactions and interactions that demarcate regions of correlated regulatory output. In contrast to
previous studies, we also identified interactions between promoters and distal sites that are maintained in multiple tissues
but are poised in embryonic stem cells and resolve to tissue-specific activated or repressed chromatin states in the mouse
embryo. Our results reveal the diversity of cohesin-associated interactions in the genome and highlight their role in
establishing the regulatory architecture of development.



[Supplemental material is available for this article.] 

Mammalian development requires the precise spatial and tempo- 
ral control of gene expression. Much of this regulatory information 
is encoded in thousands of cis-acting elements that are distributed
across the genome, often at great distances from their target genes 
(Visel et al. 2007, 2009; Blow et al. 2010; Cotney et al. 2012). The 
mechanisms that connect cis-regulatory elements to their specific
targets and that prevent them from inappropriately influencing 
other genes are not well defined. Recent studies suggest cohesin 
stabilizes DNA loops between distant-acting enhancers and their 
target promoters (Kagey et al. 2010). Cohesin is a ring-shaped 
complex consisting of the core subunits SMC1A, SMC3, SCC1 (also
known as RAD21), and SA1/SA2 (Nasmyth and Haering 2009). 
Although cohesin does not bind DNA directly, it colocalizes with 
tissue-specific transcription factors on chromatin (Schmidt et al. 
2010; Nitzsche et al. 2011) and is thought to stabilize binding of 
transcription factors at enhancers (Faure et al. 2012). In embryonic 
stem cells, cohesin shows cell-type-specific binding at enhancers 
and promoters that engage in cell-type-specific interactions, and 
knockdown of cohesin results in aberrant gene expression and loss 
of pluripotency (Kagey et al. 2010). These studies suggest tissue- 
specific gene activation is the result of tissue-specific cohesin- 
mediated DNA looping events. 

Additionally, cohesin has been shown to associate with the 
insulator factor CTCF (Rubio et al. 2008), which targets cohesin to 
specific sites in the genome (Parelho et al. 2008; Stedman et al. 
2008; Wendt et al. 2008). Cohesin is required for the enhancer- 
blocking functions of CTCF binding (Parelho et al. 2008; Wendt 
et al. 2008). CTCF also establishes chromatin barriers to prevent 
the spread of heterochromatin (Cuddapah et al. 2009; Kim et al.
2011). In embryonic stem (ES) cells, chromatin loops mediated
by CTCF interactions show correlated patterns of active or re- 
pressed histone modifications contained or excluded by the loop 
(Handoko et al. 2011). CTCF shows largely invariant binding 
patterns across tissues (Kim et al. 2007; Jothi et al. 2008) and, in 
conjunction with cohesin, may establish constitutive chromatin 
topologies in the nucleus (Dixon et al. 2012; Nora et al. 2012). 

Despite these findings, global insight into the role of cohesin 
in gene regulation remains limited because cohesin-mediated
interactions have yet to be mapped at a genome-wide scale. Here, we
use chromatin interaction analysis with paired-end tag sequencing
(ChIA-PET) to detect putative regulatory interactions involving the
cohesin subunit SMC1A in the embryonic mouse limb. The limb is
particularly well suited for this purpose. Distant-acting enhancers 
are essential for limb development (Lettice et al. 2002; Amano et al. 
2009). Moreover, a large number of distant-acting enhancers in the 
limb have been experimentally characterized by chromatin map- 
ping and mouse transgenic assays, providing a functional basis 
for interpreting cohesin-mediated interactions (Visel et al. 2007, 
2009; Cotney et al. 2012). Previous ChIA-PET studies have been
performed in mouse and human cell culture to capture interactions
involving the transcription factor estrogen receptor-α, CTCF,
and RNAPII (Fullwood et al. 2009; Handoko et al. 2011; Li et al.
2012). However, cohesin is recruited to both insulators and
enhancers, suggesting it is involved in diverse regulatory
interactions (Kagey et al. 2010; Kim et al. 2011; Majumder and Boss
2011; Seitan et al. 2011; Guo et al. 2012). Our ChIA-PET analysis
of cohesin-associated interactions in developing limb revealed 
tissue-specific enhancer-promoter interactions, as well as in- 
teractions involving both cohesin and CTCF that potentially 
establish constitutive chromatin domains across tissues. Sur- 
prisingly, we also identified interactions that are maintained in 
multiple tissues between promoters and distal regulatory ele- 
ments that show tissue-specific activation or repression during 
development.

Results

Cohesin occupies developmental enhancers 

We first identified cohesin binding sites using ChIP-seq analysis of
the cohesin subunit SMC1A in mouse embryonic (E11.5) limb bud.
To functionally annotate SMC1A binding sites, we also mapped
CTCF sites and three histone modifications: H3K27ac, which
marks active promoters and enhancers; H3K4me2, which marks
both active and inactive enhancers; and H3K27me3, which marks
sites repressed by the Polycomb repressive complex PRC2
(Supplemental Table S1; Heintzman et al. 2009; Creyghton et al. 2010;
Ernst et al. 2011). We identified 41,114 SMC1A binding sites, most
of which are located at intergenic or intronic regions (68%)
(Fig. 1A). CTCF is present at 42% of intergenic or intronic SMC1A sites,
consistent with previous studies (Schmidt et al. 2010; Faure et al.
2012). SMC1A sites that do not recruit CTCF show enrichment for
H3K27ac and H3K4me2, suggesting they include enhancers
(Supplemental Fig. S1A; Cotney et al. 2012). These sites are also enriched for
experimentally validated limb enhancers and limb-specific subsets
of the enhancer-associated factor EP300 and H3K27ac (Fig. 1B,C;
Supplemental Fig. S1B; Visel et al. 2009; Cotney et al. 2012) and are
strongly associated with genes implicated in embryonic
morphogenesis and limb development (Supplemental Fig. S1C; McLean
et al. 2010). We did not detect significant evidence of SMC1A
binding at 35% of known limb enhancers (Supplemental Fig. S1B).
These may be false negatives in our ChIP-seq experiments, especially
for enhancers active in a restricted population of cells in the limb.
Alternatively, a subset of enhancers may not recruit
SMC1A at sufficient levels for detection. The dip in H3K27ac signal
at known enhancers and EP300 sites co-occupied by cohesin
suggests that cohesin binding displaces nucleosomes, similar to binding
of a transcription factor (Fig. 1C,D; Cotney et al. 2012; Rada-Iglesias
et al. 2012). Although most putative SMC1A-bound enhancers do
not appear to recruit CTCF, a small number of regions
are marked by SMC1A, CTCF, and H3K27ac or H3K4me2 (1626
and 2369, respectively) (Fig. 1E; Supplemental Fig. S1D).
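
The CTCF co-occupancy partition described above amounts to an interval-overlap test between two peak sets. The following is a minimal sketch of that idea, not the pipeline used in this study: the function names, the half-open coordinate convention, and the toy peak coordinates are all assumptions made for illustration.

```python
from collections import defaultdict


def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals [start, end) share at least one base."""
    return a_start < b_end and b_start < a_end


def split_by_co_occupancy(smc1a_peaks, ctcf_peaks):
    """Partition SMC1A peaks into those with and without an overlapping CTCF peak.

    Peaks are (chrom, start, end) tuples. This brute-force scan is quadratic
    per chromosome and is meant only to illustrate the logic.
    """
    ctcf_by_chrom = defaultdict(list)
    for chrom, start, end in ctcf_peaks:
        ctcf_by_chrom[chrom].append((start, end))

    with_ctcf, without_ctcf = [], []
    for chrom, start, end in smc1a_peaks:
        hit = any(overlaps(start, end, s, e) for s, e in ctcf_by_chrom[chrom])
        (with_ctcf if hit else without_ctcf).append((chrom, start, end))
    return with_ctcf, without_ctcf


# Hypothetical peak coordinates, for illustration only.
smc1a = [("chr2", 100, 300), ("chr2", 1000, 1200), ("chr6", 50, 150)]
ctcf = [("chr2", 250, 400), ("chr6", 150, 260)]

with_ctcf, without_ctcf = split_by_co_occupancy(smc1a, ctcf)
fraction_with_ctcf = len(with_ctcf) / len(smc1a)
```

With genome-scale peak sets, a sorted sweep or an interval tree would replace the inner scan; the partition logic stays the same.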

Discovery and analysis of cohesin-associated interactions 

To identify chromatin interactions associated with SMC1A-bound
sites, we used chromatin interaction analysis with paired-end tag
sequencing (ChIA-PET) in developing limb. From two ChIA-PET
libraries, we generated 338,907,326 read pairs and identified
28,320 intrachromosomal ligation products, using a minimum
size limit of 5 kb to exclude self-ligation events (Supplemental
Table S2). To be considered further, each interaction was required to
show SMC1A binding at one or both anchors. We also limited our
study to interactions spanning <1 Mb, as the number of observed
interactions at or below this length significantly exceeds the
number of interactions expected from stochastic interaction events or
random ligation (false discovery rate [FDR] < 0.10) (Supplemental
Fig. S2). Using this approach, we identified 2264 SMC1A
interactions (Supplemental Table S3). We captured two major classes
of interactions: those between two intergenic or intronic regions
(1330), of which 68% are co-occupied by CTCF at either or both
anchors; and interactions between promoters and distal intergenic
or intronic regions (680), of which 56% are co-occupied by CTCF
(Fig. 2A). The distribution of ChIA-PET interactions on
chromosome 2, relative to chromatin modification patterns and gene
expression levels, is shown in Figure 2B. We captured previously
unknown interactions involving known developmental regulators,
including Snai1 (Figs. 2B, 3A). However, our ChIA-PET experiments
likely suffer from a high false negative rate, due to regulatory
heterogeneity in the limb bud and compounded inefficiencies in
ChIP enrichment and subsequent ligation of interacting sites
(Supplemental Note).

Figure 1. Cohesin binding in embryonic E11.5 limb bud. (A) Classification of SMC1A ChIP-seq peaks, partitioned by CTCF co-occupancy. (B) Intergenic
and intronic SMC1A sites lacking CTCF (light blue bars) are enriched for limb enhancers in the VISTA Enhancer Browser (Visel et al. 2007) ([*] Fisher exact
test P-value = 0.001) and putative enhancer marks, including the coactivator EP300 (Visel et al. 2009) and E11.5 limb-specific H3K27ac marking (Cotney
et al. 2012) ([**] Fisher exact test P-value < 2 × 10^−[…]), compared to sites co-occupied by CTCF (dark blue bars). (C) SMC1A and H3K27ac ChIP-seq signal
profiles at a known limb enhancer, VISTA hs1491 (Visel et al. 2007). (D) SMC1A (black) and H3K27ac (green) normalized ChIP-seq signal aggregated at
intergenic or intronic limb EP300 sites (left; n = 3613) (Visel et al. 2009) and known VISTA limb enhancers (right; n=^65) (Visel et al. 2007). (E) Overlap of
intergenic and intronic sites occupied by SMC1A, CTCF, and H3K27ac.

Figure 2. Cohesin interactions in the genome. (A) Classification of
SMC1A ChIA-PET interactions, partitioned by CTCF co-occupancy. (Int)
Intergenic or intronic sites; (Pr) promoter; (Ex) exonic. (B) Circos map of
ChIA-PET interactions on chromosome 2. The outermost to innermost
tracks are chromosome 2 (dark gray; centromere location indicated in light
gray), ChIA-PET interactions (Int-Int interactions shown in blue, Pr-Int
interactions in orange, and all others in gray), CTCF peaks (gray), z-scores of
log2 transformed RPKM values of RNA-seq (blue), H3K27ac (green), and
H3K27me3 (red). An expanded view of a 20-Mb region on chromosome 2
is shown with the location of the Snai1 locus (Fig. 3A).
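
The filtering criteria stated in the text for ChIA-PET ligation products (a 5-kb minimum span, a span under 1 Mb, and SMC1A binding at one or both anchors) can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the authors' implementation: measuring span between anchor midpoints, the `Interaction` class, and all coordinates are hypothetical, and the FDR estimation that motivated the 1-Mb limit is not reproduced.

```python
from dataclasses import dataclass

MIN_SPAN = 5_000        # 5 kb lower limit, to exclude self-ligation events
MAX_SPAN = 1_000_000    # interactions are limited to spans under 1 Mb


@dataclass(frozen=True)
class Interaction:
    chrom: str
    anchor1: tuple  # (start, end)
    anchor2: tuple  # (start, end)


def midpoint(interval):
    start, end = interval
    return (start + end) // 2


def has_peak(chrom, interval, peaks):
    """True if any (chrom, start, end) peak overlaps the half-open interval."""
    start, end = interval
    return any(c == chrom and s < end and start < e for c, s, e in peaks)


def passes_filters(interaction, smc1a_peaks):
    # Assumption: span is measured between anchor midpoints.
    span = abs(midpoint(interaction.anchor2) - midpoint(interaction.anchor1))
    if not (MIN_SPAN <= span < MAX_SPAN):
        return False
    return (has_peak(interaction.chrom, interaction.anchor1, smc1a_peaks)
            or has_peak(interaction.chrom, interaction.anchor2, smc1a_peaks))


# Hypothetical data, for illustration only.
peaks = [("chr2", 10_000, 10_400)]
candidates = [
    Interaction("chr2", (9_900, 10_500), (150_000, 150_600)),      # kept
    Interaction("chr2", (10_000, 10_400), (12_000, 12_400)),       # span < 5 kb
    Interaction("chr2", (10_000, 10_400), (2_500_000, 2_500_400)), # span >= 1 Mb
    Interaction("chr2", (500_000, 500_400), (700_000, 700_400)),   # no SMC1A peak
]
kept = [i for i in candidates if passes_filters(i, peaks)]
```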

The substantial overlap we observed between cohesin-associated
interactions and CTCF binding suggests these interactions
may partition chromatin into discrete domains (Handoko et al.
2011). Sites located within the interaction we detected at Snai1
show correlated presence or absence of H3K27ac enrichment in
E11.5 limb and E14.5 cortex, respectively (Fig. 3A). Analysis of all
SMC1A interactions suggests they show highly correlated patterns
of H3K27ac marking across 19 embryonic and adult mouse tissues
(Fig. 3B; Shen et al. 2012). H3K27me3 marking is also highly
correlated across six tissues (Fig. 3B; ENCODE Project Consortium
2011). The highest correlations of H3K27ac and H3K27me3 occur
within SMC1A loops compared to flanking intervals (Wilcoxon
P-value < 2 × 10^−[…]) (Fig. 3C). Additionally, the expression levels of
genes contained within SMC1A loops are significantly correlated
across tissues (Wilcoxon P-value < 2 × 10^−[…]) (Fig. 3C). These results
suggest SMC1A interactions establish discrete domains with
correlated histone modification states and gene expression. SMC1A
interactions still significantly partition H3K27ac and H3K27me3
when we exclude interactions involving a CTCF binding event
or motif from the analysis (Wilcoxon P-value < 2 × 10^−[…])
(Supplemental Fig. S3A,B). If promoter-distal site interactions are
considered separately from all interactions, they show correlated
chromatin marking within the loop, suggesting that
promoter-distal site interactions may also demarcate regulatory domains
(Supplemental Fig. S3C).
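
The comparison of chromatin-state correlations within loops versus across loop boundaries can be illustrated with a small, dependency-free sketch. Everything here is an assumption made for illustration: the bin labels, the toy signal values across four hypothetical tissues, and the decision to treat every pairing that is not loop-internal as an "other" pairing. The Wilcoxon test reported in the text is not reproduced.

```python
from itertools import combinations
from statistics import mean


def ranks(values):
    """1-based ranks, with tied values receiving their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    # Assumes neither vector is constant (non-zero variance).
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation computed as the Pearson correlation of ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical H3K27ac signal per bin across four tissues.
# Each entry is (inside_loop, signal_across_tissues).
bins = {
    "loop_bin_1": (True, [1, 2, 3, 4]),
    "loop_bin_2": (True, [2, 3, 5, 8]),
    "upstream_flank": (False, [4, 3, 2, 1]),
    "downstream_flank": (False, [1, 3, 2, 4]),
}

within, other = [], []
for (_, (in_a, sig_a)), (_, (in_b, sig_b)) in combinations(bins.items(), 2):
    rho = spearman(sig_a, sig_b)
    (within if in_a and in_b else other).append(rho)

within_mean = mean(within)
other_mean = mean(other)
```

In this toy data the two loop-internal bins rise and fall together across tissues, so their rank correlation is high, while pairings that include a flanking bin are weaker or negative on average.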

We next considered putative enhancer-promoter interactions.
One such interaction occurs at Pitx1, which is required for hindlimb
development (Lanctot et al. 1999). Pitx1 shows hindlimb-specific
expression at E11.5 and is looped to a previously uncharacterized
distal site 133 kb away that has hindlimb-specific H3K27ac marking
(Cotney et al. 2012). In forelimb and embryonic cortex this distal
site is marked by H3K27me3 (Fig. 4A). Chromosome conformation
capture (3C) analysis suggests this interaction is specific to the
embryonic hindlimb (Fig. 4B; Supplemental Fig. S4A). The distal
interacting site is bound by CTCF in limb and cortex (Fig. 4A),
suggesting that constitutive CTCF binding events participate in
tissue-specific enhancer-promoter interactions. For all SMC1A
interactions we detected that involve a promoter and distal site,
the interacting promoters show significantly higher gene
expression compared to genes contained within the loop (Wilcoxon
P-value < 2 × 10^−[…]) (Fig. 4C). The distal sites show enrichment
for H3K27ac and H3K4me2 (Fig. 4D). These observations suggest
a subset of SMC1A interactions between promoters and distal sites
result in tissue-specific transcriptional activation.

Our results also suggest cohesin is involved in promoter-promoter
and enhancer-enhancer interactions. We captured a
known interaction between the Irx3 and Irx5 promoters (Tena et al.
2011), which are both bound by SMC1A and may facilitate shared
transcriptional regulation (Supplemental Fig. S4C). Additionally,
we identified previously characterized interactions between
enhancers at the HoxD locus (Supplemental Fig. S4B; Gonzalez et al.
2007; Montavon et al. 2011). Overall, we identified 258 SMC1A
interactions in limb that involve two H3K27ac-marked intergenic
or intronic sites. These sites may participate in larger "regulatory
archipelagos," which have been shown to bring together multiple
regulatory elements to robustly drive transcription of target genes
(Montavon et al. 2011). We also detected 41 interactions that
involve a known developmental enhancer, only six of which are
enhancer-promoter interactions (Supplemental Table S4; Visel et al.
2007). The remaining interactions occur between known enhancers
and distal sites that are not promoters; 19 of these sites are marked
by H3K27ac in limb, suggesting potential enhancer-enhancer
interactions. Notably, 22 of the known enhancers in our interaction
set are only active in nonlimb embryonic tissues, suggesting
inactive regulatory elements may also participate in long-range
looping events.

Figure 3. Cohesin interactions partition chromatin states and gene expression levels. (A) SMC1A ChIA-PET interaction at the Snai1 locus in E11.5 limb,
with associated H3K27ac ChIP-seq and RNA-seq profiles in limb and E14.5 cortex (Ayoub et al. 2011). The upstream and downstream flanking regions are
also shown (+/- one loop distance, L). (B, left) Mean pairwise Spearman correlations of H3K27ac signal across 19 cell types (Shen et al. 2012) using sites
binned across all SMC1A interactions, spanning from one loop distance upstream (L) to one loop distance downstream (L). (Right) Mean pairwise
Spearman correlations of H3K27me3 signal across six cell types (The ENCODE Project Consortium 2011) across all SMC1A interactions. (C, left) Distribution of
pairwise Spearman correlations of H3K27ac and H3K27me3 signal within SMC1A interactions (within loop) or other pairings in the 3L interval (across loop
boundary) ([*] Wilcoxon P-value < 2 × 10^−[…]). (Right) Distribution of pairwise Spearman correlations of gene expression for SMC1A interactions containing
two or more genes (within loop) compared to other pairings in the 3L interval (across loop boundary) ([*] Wilcoxon P-value < 2 × 10^−[…]).

Figure 4. Cohesin participates in enhancer-promoter interactions. (A) SMC1A ChIA-PET interaction between Pitx1 and a distal site 133 kb upstream,
bound by CTCF and SMC1A. Although E11.5 forelimb and hindlimb tissue were combined for ChIP, it has previously been shown that both Pitx1 and the
distal site are marked only by H3K27ac in hindlimb and only by H3K27me3 in forelimb (Cotney et al. 2012). CTCF binding sites in E14.5 cortex are also
shown (Shen et al. 2012). (B) 3C analysis of interactions at the Pitx1 locus in E11.5 forelimb, E11.5 hindlimb, and E14.5 cortex, using an anchor primer
located in the gene (dashed line) and primers tiled across the locus. Interactions are normalized to a crosslinking control at Ercc3. Each data point represents
average interaction frequency and error bars represent standard error from three independent qPCR reactions (see Supplemental Fig. S4A for biological
replicate). (C) Promoters within 2 kb of an interaction anchor that loop to a distal site show significantly higher expression than promoters contained
within the loop ([*] Wilcoxon P-value < 2 × 10^−[…]). (D) E11.5 limb H3K27ac (green) and H3K4me2 (blue) normalized ChIP-seq signal aggregated at all
distal sites looping to a promoter.

A subset of cohesin interactions are present in multiple tissues 
and involve active, repressed, and poised loci 

The correlated patterns of chromatin modification and gene
expression we observe, coupled with the interactions we detected
that involve enhancers not active in limb, suggest a subset of
SMC1A interactions may be maintained across multiple tissues.
One such interaction occurs at Wnt7a, a regulator of dorsal-ventral
patterning (Parr and McMahon 1995). In the limb, Wnt7a
expression is restricted to the dorsal ectoderm, whereas it is not
expressed in limb bud mesenchyme (Parr et al. 1993). Since the
mesenchyme comprises most of the limb bud at the time point we
interrogated, most of our ChIP-seq signal and ChIA-PET
interactions are likely derived from it rather than ectoderm. However, in
the limb bud we find that the Wnt7a promoter interacts with a distal
site 125 kb upstream. Both the promoter and the distal site are
marked by the repressive modification H3K27me3 and bound by CTCF in limb
(Fig. 5A). In embryonic cortex, both the promoter and distal site are
marked by H3K27ac and Wnt7a is highly expressed (Parr et al. 1993;
Ayoub et al. 2011). 3C analysis detects the interaction in both limb
and cortex (Fig. 5B; Supplemental Fig. S5B), and the correlated
H3K27ac and H3K27me3 chromatin states suggest the interaction is
maintained across multiple tissues (Supplemental Fig. S5A). These
results support a model in which the interaction we detect in limb
serves to recruit tissue-specific regulatory factors that repress Wnt7a
in the limb bud mesenchyme. Conversely, the same interaction at
Wnt7a in embryonic cortex may recruit factors that enhance
transcription. To address the prevalence of interactions showing
different chromatin states in limb and cortex, we examined patterns of
H3K27ac and H3K27me3 in both tissues at SMC1A limb
interactions. We find 12 of the 61 promoter-distal site interactions
marked at both anchors by H3K27me3 in limb show H3K27ac in
cortex, and 27 of the 196 promoter-distal site interactions marked at
both anchors by H3K27ac in limb show H3K27me3 in cortex. These
results suggest that a subset of promoter-regulatory element
interactions may be maintained in multiple tissues and recruit
tissue-specific regulatory factors that serve to activate or repress transcription
of their target gene depending on the tissue context.

Genome Research, www.genome.org

Figure 5. Tissue-dependent activation or repression of poised cohesin interactions. (A) SMC1A/CTCF interaction at Wnt7a in ES cells (Handoko et al.
2011) and limb. Bivalent sites (H3K4me3 and H3K27me3) in ES cells are shown in orange (Mikkelsen et al. 2007). RNA-seq and H3K27ac in E11.5 limb and
E14.5 cortex, and H3K27me3 in E11.5 limb and E11.5 cortex are shown (E14.5 cortex RNA obtained from Ayoub et al. 2011). (B) 3C analysis of E11.5 limb
and E14.5 cortex interactions between the Wnt7a promoter (dashed line) and distal sites across the locus, normalized to a crosslinking control at Ercc3.
Each data point represents average interaction frequency and error bars represent standard error from three independent qPCR reactions (see
Supplemental Fig. S5B for biological replicate). (C) Diagram of a subset of bivalent ES cell promoters involved in SMC1A or CTCF ChIA-PET interactions (Handoko
et al. 2011). Only interactions with concordant chromatin states at both the promoter and distal site were considered. (I) Interactions resolving to active
H3K27ac in limb; (II) interactions resolving to repressive H3K27me3 in limb; (III) interactions resolving to opposite chromatin states in limb and cortex.
Gene names in red are involved in interactions detected in both ES cells (Handoko et al. 2011) and limb. The full diagram can be found in Supplemental
Figure S5. (D) Gene expression of promoters involved in poised, activated, or repressed interactions. Promoters of H3K27ac marked limb interactions show
significantly higher expression than bivalently marked promoters in ES cells ([*] Wilcoxon P-value = 1.6 × 10^−[…]) (Shen et al. 2012) and promoters in
H3K27me3 marked interactions ([**] Wilcoxon P-value = 6.1 × 10^−[…]).

To identify additional SMC1A interactions present in multiple
tissues, we compared our limb SMC1A interactions to a set
of CTCF-mediated interactions previously identified in ES cells
(Handoko et al. 2011). We observe 52 interactions that are present
in both data sets, including the Wnt7a interaction. In ES cells, the
Wnt7a promoter is poised, showing bivalent H3K4me3 and
H3K27me3 marking (Fig. 5A). The distal site we identified that
interacts with Wnt7a also exhibits a bivalent chromatin state in ES
cells (Mikkelsen et al. 2007). Upon differentiation, the bivalent
marks at both the promoter and the distal site resolve to H3K27me3
or H3K27ac in embryonic limb or cortex, respectively, while also
maintaining H3K4me2, which marks both active and inactive
enhancers and promoters (Fig. 5A). The Snai1 interaction we
identified in limb is also present in ES cells (Handoko et al. 2011), and the
Snai1 promoter shows bivalent marking but resolves to H3K27ac
and is transcriptionally active in limb (Mikkelsen et al. 2007).

To investigate whether maintenance of a consistent regula- 
tory topology in both bivalent and resolved states may occur at 
other loci, we characterized all interactions that recruit CTCF in ES 
cells and/or embryonic limb and also involve bivalent promoters 
in ES cells (Mikkelsen et al. 2007). We found 83 bivalent in- 
teractions that resolve to an active or repressed chromatin state in 
limb. Of these, 40 resolve to H3K27ac in limb at the promoter and 
distal site, and 39 resolve to H3K27me3 (Fig. 5C; Supplemental 
Fig. S5C). The remaining four interactions are marked by both 
H3K27ac and H3K27me3 at the promoter and distal site, suggest- 
ing both elements exhibit a gradient of activation or repression in 
the limb bud (Cotney et al. 2012). Genes participating in these 83 
interactions show low expression in mouse ES cells (Fig. 5D). In 
limb, genes that resolve to H3K27ac marking show significantly 
higher expression than in ES cells (Wilcoxon P-value = 1.69 ×
10^−[…]), whereas the genes that resolve to H3K27me3 show no
significant change in expression (Fig. 5D). 
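
The grouping of bivalent interactions by the chromatin state they resolve to in limb can be expressed as a simple classification over the marks shared by the promoter and the distal site. The sketch below is illustrative only: the function name, the category labels, the locus names, and the mark sets are hypothetical, and the requirement that a mark be present at both anchors follows the concordance criterion stated in the Figure 5 legend.

```python
from collections import Counter


def resolved_state(promoter_marks, distal_marks):
    """Classify an interaction by the marks present at BOTH anchors in limb."""
    shared = set(promoter_marks) & set(distal_marks)
    active = "H3K27ac" in shared
    repressed = "H3K27me3" in shared
    if active and repressed:
        return "both"
    if active:
        return "active"
    if repressed:
        return "repressed"
    return "unresolved"


# Hypothetical interactions: name -> (promoter marks, distal site marks).
interactions = {
    "locus_A": ({"H3K27ac", "H3K4me2"}, {"H3K27ac"}),
    "locus_B": ({"H3K27me3", "H3K4me2"}, {"H3K27me3", "H3K4me2"}),
    "locus_C": ({"H3K27ac", "H3K27me3"}, {"H3K27ac", "H3K27me3"}),
    "locus_D": ({"H3K27ac"}, {"H3K27ac", "H3K4me2"}),
}

counts = Counter(resolved_state(p, d) for p, d in interactions.values())

# Consistency check on the counts reported in the text:
# 40 active + 39 repressed + 4 marked by both = 83 bivalent interactions.
reported_total = 40 + 39 + 4
```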

A previous study of the protocadherin alpha cluster suggests 
that removal of an enhancer participating in one such bivalent 
interaction results in both tissue-specific aberrant activation and 
reduced expression of the target gene. We detect an interaction in
limb between the Pcdhac1 promoter and the HS5-1 enhancer, both
marked by H3K27me3 (Supplemental Fig. S6). This interaction also
occurs in a cultured neural cell line that expresses Pcdhac1 (Guo
et al. 2012). The promoter is bivalent in ES cells (Mikkelsen et al.
2007). Previous studies indicate that HS5-1 is necessary for driving
robust expression of protocadherin alpha genes, and deletion of
HS5-1 not only results in threefold reduction of Pcdhac1 expression
in whole brain, but also a fivefold increase of Pcdhac1 transcripts in
kidney (Kehayova et al. 2011).

Discussion 

Using ChIA-PET analysis of SMC1A, we obtained a direct view of
cohesin-associated topology in the genome. Our results suggest
that cohesin interactions facilitate tissue-specific regulatory
outcomes through several mechanisms. Consistent with previous studies
of individual loci, we find that cohesin is involved in tissue-specific
looping between promoters and enhancers. In these cases,
tissue-specific activation of gene expression is likely to depend on
tissue-specific interaction events. However, cohesin is also associated
with interactions between distal sites and promoters that are
maintained across tissues, but show tissue-specific chromatin signatures
and gene expression. Both the promoters and the distal sites in these
interactions exhibit tissue-specific active or repressed chromatin
states, suggesting in these cases that tissue-specific regulation is
achieved by altering the activation state of a constitutive chromatin
topology.

These "stable" cohesin-associated interactions appear to provide a
mechanism for establishing tissue-specific promoter
activation and repression through interaction with the same distal
site. The interaction we observed at Wnt7a involves a distal site
marked by H3K27ac in cortex and H3K27me3 in limb, suggesting
it may act as an enhancer of Wnt7a expression in some tissue
contexts and as a repressor in others. The Wnt7a promoter also
interacts with the same distal site in embryonic stem cells, where it
exhibits a bivalent chromatin state. This suggests the same
interaction event may maintain a poised state in ES cells and serve to
activate or repress the target gene in differentiated tissues (Fig. 6).
One such distal site with dual functions is the HS5-1 enhancer at
the Pcdhac1 locus: Loss of the enhancer results in reduced Pcdhac1
expression in brain and increased expression in tissues that
normally exhibit very low levels of Pcdhac1 (Kehayova et al. 2011).

The mechanisms by which these stable interactions produce 
tissue-specific transcriptional outcomes remain to be determined. 
In one potential model, stable interactions are maintained across 
many tissues by cohesin and CTCF, irrespective of their tran- 
scriptional output. Tissue-specific transcription factors would then 
serve to activate or repress this constitutive regulatory topology. 
Alternatively, apparent "stable" interactions may be indepen- 
dently specified in different tissues by CTCF in conjunction with 
cohesin, leading to activation or repression depending on the tis- 
sue context. In addition, although we explicitly focused on po- 
tentially stable cohesin-associated interactions that involve CTCF 
in our analysis, other factors besides CTCF may also be sufficient. 
Conditional deletion of Ctcf from the mouse embryonic limb has
been shown to result in small changes in overall gene expression.
However, critical limb development genes are down-regulated,
including Shh, Fgf4, Grem1, and Jag1, whereas proapoptotic Bbc3 is
derepressed, potentially contributing to the degeneration of distal
limb structures following loss of Ctcf (Soshnikova et al. 2010).
Therefore, CTCF may only be required at a subset of stable inter- 
actions, or may not be necessary for the maintenance of previously 
established interactions. 

Our results also suggest that cohesin generally establishes
a stable chromatin topology in the nucleus, in addition to the
specific examples we discuss here. Considered collectively, the
cohesin-associated interactions we identified exhibit correlated
H3K27ac and H3K27me3 chromatin states and gene expression
across tissues. This is consistent with previous Hi-C and 5C studies
that identified constitutive topological domains maintained across
tissues and species (Lieberman-Aiden et al. 2009; Dixon et al. 2012;
Nora et al. 2012). Stable cohesin-associated interactions may serve
as a constitutive chromatin scaffold that delimits the activity of
tissue-specific regulatory elements. For example, the stable
interaction at Wnt7a encompasses several putative cortex enhancers
identified by H3K27ac, and potentially restricts the activity of
these enhancers to the Wnt7a promoter while excluding outside
enhancers from influencing Wnt7a expression.

Figure 6. Distal regulatory sites can act as both enhancers and repressors in a tissue-dependent
context. A model of tissue-specific gene regulation obtained via stable chromatin interactions. Bivalent
promoters in embryonic stem cells are held in a poised, looped conformation with a distal site. Upon
differentiation, the distal site is either activated by tissue-specific regulatory factors, leading to
deposition of H3K27ac and gene transcription, or repressed by recruitment of PRC2, as indicated by
H3K27me3 marking.

Investigating these hypotheses will require functional studies 
of Wnt7a and other loci that exhibit stable interaction events. For 
example, transgenic analysis of bacterial artificial chromosomes 
(BACs) spanning the Wnt7a locus, from which the distal interacting 
site has been removed using recombineering, may determine 
whether the long-range interaction is required for spatiotemporal 
regulation of Wnt7a. To conclusively establish that the stable in- 
teraction at Wnt7a maintains the fidelity of Wnt7a expression in 
vivo will ultimately require removal of the distal site from the 
mouse genome directly using homologous recombination in ES 
cells and generation of knockout mice. Such models would also 
potentially reveal developmental phenotypes arising from de- 
stabilization of specific long-range chromatin topologies. 

Analyses of chromatin modification and transcription factor 
binding have produced two-dimensional regulatory maps of many 
mouse and human tissues, but lack the connectivity information 
required to assign regulatory elements to specific genes. Here, 
we obtained an initial, genome-wide view of three-dimensional 
cohesin-associated chromatin interactions. Our results highlight 
the diverse roles of cohesin in establishing chromatin topology 
and tissue-specific gene expression and provide insight into how 
regulatory functions are partitioned in the genome. 

Methods 

RNA-seq 

All animal work was performed in accordance with approved Yale 
IACUC protocols. C57BL/6J mouse forelimb and hindlimb buds 
from 24 E11.5 embryos were dissected and total RNA extracted as 
described in Cotney et al. (2012). RNA-seq libraries were 
constructed with the Illumina TruSeq RNA Sample Preparation Kit and 
sequenced on an Illumina HiSeq2000 (2 × 75-bp reads). RNA-seq 
reads from E11.5 forelimb and hindlimb were pooled together and 
aligned to the mouse reference genome (mm9) using TopHat 
(v1.4.1) (Trapnell et al. 2009) with a known transcriptome index 
(UCSC Known Gene annotation; downloaded 11/9/2012) (Dreszer 
et al. 2012). E14.5 cortical plate RNA-seq data (Ayoub et al. 2011) 
was downloaded from GEO (mapped reads in bigWig format). 

ChIP-seq 

SMC1A, CTCF, H3K27ac, H3K4me2, and H3K27me3 ChIP-seq was 
performed as described in Cotney et al. (2012) with modification. 
The detailed protocol is provided in the Supplemental Methods. 
Purified ChIP DNA was prepared for Illumina sequencing with 
NEBNext ChIP-Seq Library Prep (NEB) with multiplex adaptors 
and sequencing primers. Libraries were sequenced on an Illumina 
HiSeq2000 (1 × 75-bp or 2 × 75-bp reads). ChIP-seq reads were aligned to 
the mouse reference genome (mm9) using Bowtie (v0.12.7) 
(Langmead et al. 2009). SMC1A and CTCF peaks were called using 
MACS (Zhang et al. 2008; Feng et al. 2011). For histone 
modifications, peaks were called using a custom Perl script. Peak calling 
information is provided in the Supplemental Methods. Intergenic 
and intronic peaks were identified by filtering exons and regions 
within 1 kb from a transcription start site (TSS). ChIP-seq signal 
aggregation analyses of H3K27ac, H3K4me2, and H3K27me3 were 
performed as described in Cotney et al. (2012). 

ChIA-PET 

ChIA-PET was performed as described by Fullwood et al. (2010), 
with modification. Approximately 500 µg of soluble chromatin 
from E11.5 limb buds was immunoprecipitated in five parallel 
reactions with Dynabeads Protein G beads (Life Technologies) bound 
to SMC1A antibody. Custom ChIA-PET ligation adaptors (see 
Supplemental Table S5 for oligo sequences) containing 3' T 
overhangs were ligated onto chromatin (which had complementary A 
overhangs). Following immobilization of purified ChIA-PET DNA 
on Dynabeads M-280 Streptavidin beads (Life Technologies), beads 
were added directly to Phusion High-Fidelity PCR Master Mix 
(NEB) and PCR amplified with standard Illumina PE primers for 18 
cycles. Size-selected ChIA-PET libraries were sequenced on an 
Illumina HiSeq2000 (2 × 75-bp reads) using a modified denaturation 
protocol for low-concentration Illumina libraries (Quail et al. 2008). 
The detailed protocol is provided in the Supplemental Methods. 

ChIA-PET read alignment and interaction calling 

Paired-end reads were trimmed of ChIA-PET adaptors using 
Cutadapt (v0.9.4) and the following parameters: -a 
AGTTGGATACCTGCAGTACTAGTCAGTGGGCCC -m 17 -M 18 -O 33 (Martin 
2011). Trimmed mate pairs were aligned separately with ELAND 
(CASAVA v1.8.1) with the following parameters: -ub Y\*-bam. 
Only reads with MAPQ > 1 were retained using SAMtools (Li et al. 
2009). Aligned reads were paired with mates and were filtered for 
PCR duplicates (>1 paired-end read with same start positions of both 
mates). Paired-end reads were categorized into interchromosomal, 
intrachromosomal (distance between mates >5 kb), and self 
(distance between mates <5 kb). Only intrachromosomal reads with 
mates <1 Mb apart and overlapping at least 1 SMC1A ChIP-seq 
peak by a minimum of 1 bp were considered for analysis (see 
"Simulation of ChIA-PET size distribution" for justification of the 
size restriction). Interactions in which both start positions were 
within 2 kb were joined and their midpoint used to specify their 
genomic location. The UCSC Known Gene and Known Alternative 
annotation (downloaded 11/9/2011) (Dreszer et al. 2012) was used 
to classify interactions. An interaction anchor was designated as 
a promoter if it was within 2 kb of a transcription start site. An 
interaction anchor was designated as an exon if it overlapped an 
exon by at least 1 bp and was not previously assigned to the 
promoter category. All other anchors were considered intergenic or 
intronic. To be classified as a CTCF co-occupied interaction, at 
least one anchor was required to be within 500 bp of a CTCF 
ChIP-seq peak in E10.5 (Cotney et al. 2012) or E11.5 mouse limb 
bud. For correlation analyses of H3K27ac or H3K27me3 at 
interactions with or without CTCF, we also designated interactions 
within 500 bp of a predicted CTCF motif as CTCF interactions. See 
the Supplemental Methods for more information regarding 
eliciting and mapping the CTCF motif in the mouse genome. 
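
The categorization and anchor classification rules above can be sketched in a few lines of Python. This is a minimal illustration and not the pipeline used in this study: the names Pet, categorize, and classify_anchor are hypothetical, each anchor is reduced to a single midpoint coordinate, and a mate distance of exactly 5 kb (which the text leaves undefined) is assigned to the "self" category as an assumption.

```python
from typing import List, NamedTuple, Tuple

SELF_MAX_BP = 5_000      # mates at or below this distance are treated as "self" (assumption for exactly 5 kb)
PROMOTER_FLANK_BP = 2_000  # an anchor within 2 kb of a TSS is called a promoter


class Pet(NamedTuple):
    """One paired-end tag: chromosome and start position of each mate."""
    chrom1: str
    start1: int
    chrom2: str
    start2: int


def categorize(pet: Pet) -> str:
    """Assign a paired-end tag to one of the three categories in the text."""
    if pet.chrom1 != pet.chrom2:
        return "interchromosomal"
    distance = abs(pet.start2 - pet.start1)
    return "intrachromosomal" if distance > SELF_MAX_BP else "self"


def classify_anchor(pos: int, tss_positions: List[int],
                    exons: List[Tuple[int, int]]) -> str:
    """Classify an anchor midpoint; the promoter category takes precedence over exon."""
    if any(abs(pos - tss) <= PROMOTER_FLANK_BP for tss in tss_positions):
        return "promoter"
    if any(start <= pos < end for start, end in exons):  # half-open exon intervals
        return "exon"
    return "intergenic_or_intronic"
```

Checking the promoter rule first mirrors the statement that an anchor is called an exon only if it was not already assigned to the promoter category.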

Simulation of ChIA-PET size distribution 

To determine a size range in our ChIA-PET data set in which 
authentic interactions are enriched over random ligation or 
stochastic collisions, we carried out 100 random ligation simulations. 
We randomly paired SMC1A ChIP-seq data from biological 
replicate 1 (reads were trimmed to 18 bp and each read was aligned 
separately from its pair with ELAND, as described for ChIA-PET) 
using custom shell scripts and random number generation. The 
simulated ligations were then categorized into interchromosomal, 
intrachromosomal, and self. The intrachromosomal ligations were 
subsampled to equal the total number of observed ChIA-PET 
intrachromosomal interactions (n = 5469), and the resulting size 
distribution for each simulation was determined (Supplemental 
Fig. S2). We restricted our SMC1A ChIA-PET interaction set to 
those spanning <1 Mb to reduce the likelihood of spurious ligation 
products (FDR < 0.10). FDR was calculated by dividing the 
frequency of random ligations <1 Mb by the observed frequency. 
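
The FDR calculation described above amounts to a ratio of two frequencies. The Python sketch below illustrates that arithmetic under stated assumptions: the function names are hypothetical, the inputs are plain lists of mate-to-mate distances in base pairs, and simulate_pairs is only a stand-in for the custom shell scripts used in this study, which are not reproduced here.

```python
import random
from typing import List, Sequence, Tuple

CUTOFF_BP = 1_000_000  # size restriction applied to the interaction set


def simulate_pairs(read_positions: Sequence[int], n_pairs: int,
                   seed: int = 0) -> List[Tuple[int, int]]:
    """Randomly pair read positions to mimic random ligation (illustrative only)."""
    rng = random.Random(seed)
    return [(rng.choice(read_positions), rng.choice(read_positions))
            for _ in range(n_pairs)]


def fraction_below(spans: Sequence[float], cutoff: float = CUTOFF_BP) -> float:
    """Fraction of ligation products whose span is below the cutoff."""
    if not spans:
        raise ValueError("spans must not be empty")
    return sum(1 for s in spans if s < cutoff) / len(spans)


def span_fdr(simulated_spans: Sequence[float], observed_spans: Sequence[float],
             cutoff: float = CUTOFF_BP) -> float:
    """Frequency of random ligations below the cutoff divided by the observed frequency."""
    observed = fraction_below(observed_spans, cutoff)
    if observed == 0:
        raise ValueError("no observed interactions below the cutoff")
    return fraction_below(simulated_spans, cutoff) / observed
```

With 100 simulations, one would presumably compute this ratio per simulation (or on the pooled simulated spans) after subsampling each simulation to the observed number of intrachromosomal interactions.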

Chromosome conformation capture (3C) 

3C was performed as described by Miele and Dekker (2009) with 
modification. Approximately 100 E11.5 limb buds or four E14.5 
cortices were dissected in cold PBS. The detailed protocol is 
provided in the Supplemental Methods. 3C libraries were quantified 
by PicoGreen dsDNA assay (Invitrogen). Control 3C templates 
were generated from the following BACs: RP23-400A6 (Pitx1) and 
RP23-237E3 and RP23-446L3 (Wnt7a). Digestion and ligation 
efficiency were assessed using qPCR with Power SYBR Green PCR 
Master Mix (Life Technologies). Between 5- and 100-ng template 
DNA was loaded per reaction, and each reaction was performed in 
triplicate for each biological replicate (see Supplemental Table S5 
for oligo sequences). The mean value for each ligation was 
normalized against an interaction at the Ercc3 locus for each 
experiment (Tena et al. 2011), except for the Pitx1 3C replicate 
(Supplemental Fig. S3A), which was normalized by internal copy number 
(chr9.2_CTGF primers). 

Identification of SMC1A sites with known or putative enhancer 
function 

To identify SMC1A binding sites with known developmental 
enhancer activity, we obtained coordinates for all known human and 
mouse positive enhancers in the VISTA Enhancer Browser and 
intersected with SMC1A peaks using BEDTools, requiring at least 
1-bp overlap (Visel et al. 2007; Quinlan and Hall 2010). To identify 
SMC1A binding sites overlapping limb EP300 binding sites, we used 
E11.5 limb EP300 peak calls from Visel et al. (2009) and intersected 
as described above. To identify SMC1A binding sites with 
limb-specific H3K27ac marking, we used the limb-specific E11.5 H3K27ac 
cluster (compared to mouse ES cells and neural progenitor cells) 
identified from Cotney et al. (2012) and filtered sites by overlapping 
(>1 bp) H3K27ac peaks. Aggregation of H3K27ac and SMC1A 
ChIP-seq signals was performed as described in Cotney et al. (2012) at 
EP300 sites and VISTA limb enhancers >2 kb from a TSS. 

Generation of Circos map 

The map of ChIA-PET interactions on chromosome 2 was 
ated using the Circos software package (Krzywinski et al. 2009). 

Correlation analysis of gene expression and histone 
modifications 

E11.5 limb RNA-seq and H3K27ac ChIP-seq generated in this study 
was analyzed in conjunction with data from 18 other mouse 
tissues or cell types (bone marrow, cerebellum, cortex, embryonic 
brain, adult heart, embryonic heart, adult liver, embryonic liver, 
intestine, kidney, lung, MEF, mESC, olfactory bulb, placenta, 
spleen, testis, thymus) reported by Shen et al. (2012). Data was 
downloaded from GEO (mapped reads in BAM format). Of the 19 
data sets reported, mouse limb E14.5 data was excluded due to its 
redundancy with the limb E11.5 data generated by this study. For 
RNA-seq data, a composite gene model was generated by 
combining all annotated transcripts from UCSC Known Gene 
annotation (downloaded 11/9/2011) (Dreszer et al. 2012), and RPKM 
(reads per kilobase per million mapped reads) values were calculated 
using these models. RSEQtools was used to construct the gene 
models and compute RPKM values (Mortazavi et al. 2008; Habegger 
et al. 2011). H3K27me3 data from E11.5 limb and E14.5 cortex 
generated in this study was analyzed in conjunction with data 
from four other mouse tissues or cell types (C2C12, G1E, mESC, 
mNPC). C2C12 and G1E data were downloaded from ENCODE 
(The ENCODE Project Consortium 2011) (mapped reads in BAM 
format). mESC and mNPC data were downloaded from GEO 
(Mikkelsen et al. 2007) (mapped reads in BAM format). For all 
histone modification data sets, only uniquely mapped reads were used 
for subsequent analyses. PCR duplicates were excluded. 
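
For reference, the RPKM normalization mentioned above follows the standard definition of Mortazavi et al. (2008). The short Python sketch below illustrates only that arithmetic; it is not the RSEQtools implementation, and the function name and example numbers are hypothetical.

```python
def rpkm(read_count: int, feature_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of feature per million mapped reads."""
    if feature_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    # 1e9 = 1e3 (bp per kilobase) * 1e6 (reads per million)
    return read_count * 1e9 / (feature_length_bp * total_mapped_reads)


# Hypothetical example: 100 reads on a 2-kb composite gene model,
# 10 million mapped reads in the library.
example_value = rpkm(100, 2_000, 10_000_000)
```

The same formula applies to histone modification bins if the bin length is used in place of the gene model length.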

For the histone modification analysis of SMC1A interactions 
(shown in Fig. 3; Supplemental Fig. S3), the loop and equivalently 
sized flanking regions upstream and downstream were each binned 
into 41 windows as follows: Starting from each end, 20 windows 
were generated of length equal to 1/41 of the size of the loop, and 
a final window covering any remainder. As a result, 123 bins were 
generated for each three-loop-size region. Each bin was represented 
by a vector of its RPKM values for each histone mark across all tis- 
sues/cell types (19 for H3K27ac and six for H3K27me3), and pair- 
wise correlations between any two bins were calculated as the 
Spearman correlation coefficients of such vectors, using the "cor" 
function in R. To generate mean correlation coefficient heatmaps 
(shown in Fig. 3B; Supplemental Fig. S3A,B), the pairwise correla- 
tion coefficients for each bin were averaged across all loops. To 
statistically evaluate the correlation of histone modification sig- 
nals at interactions (Fig. 3C; Supplemental Fig. S3), we considered 
the original, nonaveraged correlation coefficients and compared 
pairwise comparisons of bins contained within the loop versus all 
other pairings of bins in the three-loop size regions. 

For the gene expression analysis of SMC1A interactions 
(shown in Fig. 3C), interactions were filtered using the following 
criteria: At least two genes were required to be present in the actual 
loop (995 loops passed filter), and at least one gene in either the 
upstream or downstream flanking region (973 loops passed filter). 
Pairwise Spearman correlation coefficients of RPKM values were 
calculated for each category: (1) all pairwise comparisons of genes 
within the loop; and (2) all other pairings. For each loop, values in 
each category were averaged, resulting in one value representing 
either within loop comparisons or all other pairings. 

For the analysis of the Wnt7a locus shown in Supplemental 
Figure S5A, we only considered the region chr6:91123700-91361700, 
which includes the ChIA-PET interactions at the Wnt7a 
locus and the adjacent Fbln2 locus. We used a 2-kb window in this 
analysis. Pairwise comparisons to calculate Spearman correlation 
coefficients of RPKM for H3K27ac and H3K27me3 were calculated 
as described above. 

Shared interactions between ES and Limb ChIA-PET 

To identify overlapping interactions between the limb SMC1A 
ChIA-PET and the ES cell CTCF ChIA-PET data reported in Handoko 
et al. (2011), we used the midpoint of each interaction anchor 
reported in Handoko et al. Interactions were considered shared 
between data sets if anchors at both sides were within 10 kb of each 
other. 
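
The sharing criterion above can be expressed as a simple comparison of anchor midpoints. The Python sketch below is illustrative only: the tuple layout (chromosome, anchor midpoint, anchor midpoint) and the function name are assumptions, and "within 10 kb" is read as a distance of at most 10,000 bp.

```python
from typing import Tuple

Interaction = Tuple[str, int, int]  # (chromosome, anchor midpoint, anchor midpoint)


def is_shared(a: Interaction, b: Interaction, tolerance_bp: int = 10_000) -> bool:
    """True if both anchors of two intrachromosomal interactions lie within the tolerance."""
    chrom_a, *anchors_a = a
    chrom_b, *anchors_b = b
    if chrom_a != chrom_b:
        return False
    # Sort so that left anchors are compared with left, right with right.
    left_a, right_a = sorted(anchors_a)
    left_b, right_b = sorted(anchors_b)
    return (abs(left_a - left_b) <= tolerance_bp
            and abs(right_a - right_b) <= tolerance_bp)
```

Sorting the two anchors before comparison makes the test independent of the order in which each data set reports them.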

Identification of bivalent chromatin interactions 

SMC1A ChIA-PET interactions generated in this study (2264) and 
CTCF ChIA-PET interactions <1 Mb in length (1787) from Handoko 
et al. (2011) were merged into a single set. The midpoints of 
interaction anchors were calculated and extended +/- 2 kb. For the 
SMC1A ChIA-PET interactions, both the promoter and distal 
anchors were required to overlap a CTCF binding site in limb, mouse 
ES cells, or E14.5 cortex (Shen et al. 2012). Bivalent promoters were 
identified in mouse ES cells by intersecting H3K4me3 and 
H3K27me3 peaks using BEDTools (Quinlan and Hall 2010), called 
using the same histone modification peak-calling method described 
in the Supplemental Methods. Putative bivalent interactions were 
defined as interactions between an intergenic or intronic site and 
a bivalent promoter. To identify interactions acquiring H3K27ac or 
H3K27me3 in limb or cortex, H3K27ac peaks in E11.5 limb and 
E14.5 cortex or H3K27me3 peaks in E11.5 limb and E11.5 cortex 
were intersected with each anchor. Both the promoter and distal site 
for each interaction must overlap a given chromatin mark to be 
considered. 
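
The requirement that both anchors overlap a given mark can be sketched as an interval test on the extended anchor midpoints. The Python example below is a hypothetical illustration and not the BEDTools-based procedure used in this study: the function names are assumptions, intervals are half-open, and all peaks are assumed to lie on the same chromosome as the interaction.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open (start, end)


def anchor_window(midpoint: int, flank_bp: int = 2_000) -> Interval:
    """Extend an anchor midpoint by +/- 2 kb."""
    return (midpoint - flank_bp, midpoint + flank_bp)


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two half-open intervals share at least 1 bp."""
    return a[0] < b[1] and b[0] < a[1]


def interaction_has_mark(promoter_mid: int, distal_mid: int,
                         peaks: List[Interval]) -> bool:
    """True only if both the promoter and the distal anchor overlap a peak of the mark."""
    def marked(midpoint: int) -> bool:
        window = anchor_window(midpoint)
        return any(overlaps(window, peak) for peak in peaks)

    return marked(promoter_mid) and marked(distal_mid)
```

Applying this once with H3K27ac peaks and once with H3K27me3 peaks for a given tissue would separate interactions that resolve to an activated state from those that resolve to a repressed state.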

Data access 

The data generated in this study have been deposited in the NCBI 
Gene Expression Omnibus (GEO; http://www.ncbi.nlm.nih.gov/ 
geo; Edgar et al. 2002) under accession number GSE42237. 